Replicating data over a public network

ABSTRACT

A technique includes causing an agent device to set up a replication partnership between a first storage node and a second storage node. Causing the agent device to set up the replication partnership includes configuring a proxy server that is associated with the second storage node to establish a secure communication channel for the replication partnership over a public network. Configuring the proxy server includes storing in the proxy server credentials for authenticating the first storage node to use the secure communication channel; and establishing port translations to be used in the secure communication channel in communicating replication data between the first storage node and the second storage node. Causing the agent device to set up the replication partnership may also include communicating replication partnership information to the second storage node.

BACKGROUND

A computer network may have a backup and recovery system for purposes of restoring data on the network to a prior, consistent state should the data become corrupted, be overwritten, be subject to a viral attack, etc. The backup data may be stored at a different geographic location than the source data. For example, backup data for a given group of storage nodes of a computer network may be stored in a geographically remote, cloud-based group, or pod, of storage nodes.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of an environment associated with a replication partnership according to an example implementation.

FIG. 2 is an illustration of local and remote port forwarding according to an example implementation.

FIG. 3A depicts a flow diagram of a technique performed by a user via a browser to set up a replication partnership in which replication data is communicated over a public network according to an example implementation.

FIG. 3B depicts a flow diagram of a technique performed by an agent to set up the replication partnership according to an example implementation.

FIG. 3C depicts a flow diagram of a technique performed by a proxy server to initiate the replication partnership according to an example implementation.

FIG. 4 is a flow diagram depicting a technique to set up a proxy server for a replication partnership in which replication data is communicated over a public network according to an example implementation.

FIG. 5 is a schematic diagram of an apparatus that provides a proxy for a storage array of a replication partnership according to an example implementation.

FIG. 6 is an illustration of instructions stored on a non-transitory storage medium, which are executable by a machine to set up a secure network tunnel to communicate replication data between storage nodes according to an example implementation.

DETAILED DESCRIPTION

A group of one or multiple storage nodes of a computer network may be configured to be a replication partner with a geographically remote group of one or multiple storage nodes. Due to this partnership, replication data may be communicated between the replication partners so that, in general, each replication partner stores the same data. As an example, a group of storage nodes of a computer network may have a replication partnership with a group, or pod, of cloud-based storage nodes. In this manner, the group of storage nodes of the computer network may be a local replication partner, and as data changes on the local replication partner, the local replication partner may communicate replication data to the cloud-based storage nodes, which form the remote replication partner. In general, the replication data represents changes in the data stored on the local replication partner, so that the data stored on the remote replication partner may be used to restore the data on the local replication partner to a prior, consistent state.

In this context, a “storage node” refers to an independent unit of storage, which contains one or multiple storage devices (flash memory drive devices, magnetic media drive devices, and so forth) and is capable of communicating data with another storage node. As a more specific example, a given storage node may be an independent computer system containing one or multiple storage devices, a storage area network (SAN), and so forth. Moreover, a given storage node may employ block-based or file-based storage.

Because the replication partners may be disposed at different geographical locations, the replication data may be communicated between the replication partners over a secure communication channel of a public network. In this context, a “secure communication channel” refers to a logical connection that employs some degree of security features for purposes of preventing unauthorized access to or the reading of the data communicated between the replication partners. As an example, the secure communication channel may involve the encryption of plaintext data to form ciphertext data that is communicated over the communication channel and the decryption of ciphertext data that is received from the communication channel. The secure communication channel may be established, for example, by a secure channel communication protocol, such as the Secure SHell (SSH) protocol, which establishes a secure communication channel called an “SSH tunnel,” or “SSH connection,” herein. Thus, when two arrays are geographically apart, replication happens over a public network, and the replication employs security for the data exchange for purposes of preventing malicious attempts at reading the data. Because setting up replication over a public network may involve opening up network ports in the infrastructure or poking a hole in the private network, securing the network infrastructure at the two endpoints of the replication may be beneficial.

One way to securely communicate replication data over a public network is to delegate the security of the replication data transfer and the security of the network infrastructure to specific network devices, such as firewall or virtual private network (VPN) devices. These network devices may not, however, be part of either storage array system. If potential replication partners are not owned by the same entity and will involve communicating the replication data over a public network, it may be challenging to set up such network devices with the appropriate keys, credentials, and so forth.

In accordance with example implementations that are described herein, a web portal and a proxy server (e.g., an “SSH proxy” in accordance with example implementations) are used to set up and manage a replication partnership between a first replication partner (called a “local replication partner” herein) and a geographically remote second replication partner (called a “remote replication partner” herein) over a public network. As an example, the local replication partner may contain one or multiple storage nodes, such as storage node(s) of a computer network; and a backup and recovery solution for the computer network may include forming a replication partnership between these storage node(s) of the computer network and a group, or pod, of one or multiple cloud-based storage node(s) (i.e., the remote replication partner).

In general, a user associated with a computer network may access the web portal through a browser (an application that allows accessing information from the Internet) that executes on the user's computer. The web portal may be provided by an agent (a cloud-based server, for example) for the local replication partner. In general, the agent orchestrates setting up the replication partnership, and the agent/web portal may be owned by a different entity than the entity that owns the computer network. The user may provide input (via keystrokes, mouse clicks, touch screen gestures, and so forth) to the web portal for purposes of requesting a new replication partnership. In this manner, the input provided to the web portal may describe the overall setup for the replication partnership. The “overall setup” for the replication partnership generally includes setting up criteria for the replication partnership, such as an identifier used in the future to identify the replication partnership, the one or more storage nodes that form the “local partner” of the replication partnership, the identifier(s) of the local storage nodes of the local replication partner, keys used by the local replication partner, and so forth. Moreover, the user may provide input to the web portal identifying criteria (storage tier, storage size, geographical area, and so forth) for selecting the storage node(s) that will form the remote replication partner. Moreover, in accordance with example implementations, the user may provide input to the web portal which identifies one or multiple storage nodes of the computer network that are to form the local replication partner; and the browser (via execution of a script, for example) retrieves information, such as SSH keys and storage node identifications (IDs), for example, from the storage node(s) of the local replication partner and sends this information to the web portal. The web portal may then use the information to configure the replication partnership, as further described herein.

In general, an “SSH key” refers to a key that is used to identify a network entity to an SSH server using public key cryptography and challenge-response authentication, pursuant to the SSH protocol.

In accordance with example implementations, based on the criteria provided by the user, the agent selects the storage node(s) that form the remote replication partner. In accordance with example implementations, selection involves selecting a particular data center associated with the user's selected geographical area and selecting a group of one or multiple storage nodes of the selected data center.

In accordance with example implementations, the remote replication partner is associated with a proxy server (a proxy server of the selected data center, for example). The proxy server serves as a network endpoint for replication partnerships involving storage nodes of the data center. It is noted that the data center may include multiple such proxy servers.

In general, the agent communicates with the proxy server to configure the remote replication partner for the replication partnership and to configure the proxy server for the secure communication channel to be used to communicate the replication data for the replication partnership. In accordance with some implementations, the agent may communicate with the proxy server by issuing commands to the proxy server via a remote shell (RSH). In accordance with further example implementations, the agent may communicate with the proxy server using REpresentational State Transfer (REST) application programming interface (API) calls, which result in HyperText Transfer Protocol (HTTP) requests.

In accordance with example implementations, the agent communicates with the proxy server to store credentials of the local replication partner in the proxy server so that when the local replication partner initiates the secure communication channel with the proxy server, the proxy server can authenticate the local replication partner. Moreover, the agent communicates with the proxy server to set up port translations (or “mappings”) that are used in connection with the secure communication channel. In this context, the port translations refer to port forwarding performed by both network endpoints of the secure communication channel, as further described herein.
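As a non-limiting illustration, the agent's calls to the proxy server might resemble the following Python sketch, assuming a hypothetical REST API on the proxy server; the endpoint paths, field names, and PROXY_URL are assumptions made for illustration only and do not describe any particular implementation.

import requests

# Hypothetical base URL of the proxy server's configuration API.
PROXY_URL = "https://proxy.example-datacenter.net:8443"

def store_partner_credentials(ssh_user, ssh_public_key):
    # Register the local replication partner's SSH user and public key so
    # that the proxy server can authenticate the partner when it later
    # initiates the secure communication channel.
    response = requests.post(
        f"{PROXY_URL}/api/replication/credentials",
        json={"user": ssh_user, "public_key": ssh_public_key},
        timeout=30,
    )
    response.raise_for_status()

def configure_port_translations(local_forwards, remote_forwards):
    # Each entry is a (listen_port, target_host, target_port) tuple,
    # mirroring the -L and -R options of an ssh command line.
    response = requests.post(
        f"{PROXY_URL}/api/replication/port-forwarding",
        json={"local": local_forwards, "remote": remote_forwards},
        timeout=30,
    )
    response.raise_for_status()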

In accordance with example implementations, after the agent communicates with the proxy server (of the remote replication partner) to set up the replication partnership and set up the associated secure communication channel, the agent may then communicate data to the local replication partner pertaining to details about the replication partnership and the secure communication channel. In this manner, the agent may communicate replication partnership login identification for the local replication partner, port forwarding details, and so forth. The local replication partner may thereafter initiate communication with the proxy server (e.g., an SSH proxy, in accordance with example implementations) for purposes of creating the secure communication channel (e.g., an SSH connection, or SSH tunnel); and after the secure communication channel is created, the local replication partner may communicate with the remote replication partner for purposes of transferring replication data over the secure communication channel.

Due to the techniques and systems that are described herein, information may be readily, easily, and securely exchanged between storage nodes for purposes of setting up a replication partnership between the nodes, even when the storage nodes are owned by different entities. Moreover, the storage nodes do not need associated special devices, such as virtual private network (VPN) devices, firewalls, and so forth.

FIG. 1 depicts an example environment 100 for setting up a replication partnership according to an example implementation. The replication partnership to be set up for this example includes a first replication partner 108 (also called the “local replication partner” herein) and a second replication partner 120 (also called the “remote replication partner” herein). The local replication partner 108 includes one or multiple storage nodes 110, and the remote replication partner 120 includes one or multiple storage nodes 121. As an example, a storage node 110 or 121 may be a storage array (an array in a storage area network (SAN), for example).

As a more specific example, the storage node(s) 110 may be at a different geographical location than the storage node(s) 121. In accordance with some implementations, the storage node(s) 121 may be part of a data center 149, and the data center 149 may be one of multiple data centers 149 that provide cloud-based data storage and are located at different geographical locations. For example, for the United States, one or multiple data centers 149 may be associated with an East coast location, one or multiple data centers 149 may be associated with a West coast location, and so forth.

It is noted that FIG. 1 depicts an example group, or pod, of storage nodes 121 of the particular data center 149-1. The data center 149-1 may contain additional storage nodes, which may be associated with different storage pods and/or different replication partnerships. The storage nodes 121 that are depicted in FIG. 1 may be associated with more than one replication partnership. Moreover, a computer network containing the storage nodes 110 for the local replication partner 108 may have additional storage nodes that are not associated with the replication partnership described herein, and the storage nodes 110 depicted in FIG. 1 may be associated with other replication partnerships.

As also depicted in FIG. 1, for the replication partnership described herein, a secure communication channel, such as an SSH tunnel 130, is to be used to communicate the replication data between the replication partners 108 and 120. In general, replication data may be communicated between the replication partners 108 and 120 in either direction across the SSH tunnel 130.

In general, the storage nodes 110 may, in accordance with example implementations, be associated with a private network (not illustrated). In this manner, in general, the storage nodes 110 may not have addresses that are accessible via public Internet Protocol (IP) addresses.

As a more specific example, in accordance with some implementations, the computer 170 may execute machine executable instructions to provide an Internet browser 174. Using the browser 174, the user may, via public network fabric 129, access a web portal 182, which is an Internet-based interface that is provided by an agent 180 (an Internet server, for example). The connection to the web portal 182 may be through a Hypertext Transport Protocol Secure (HTTPS) session, for example. After providing the appropriate login credentials, the user may access a page of the web portal 182 for purposes of creating, or setting up, the remote replication partner. Using the access to the web portal 182, the user may enter a name for the remote replication partner to be created, data (through dialog boxes, for example) to select a particular geographic region (e.g., East Coast, Midwest, Southwest, West Coast, and so forth) for the remote replication partner, and other criteria to be considered for purposes of selecting the remote replication partner, such as the storage tier and the amount, or capacity, of data storage. Based on these parameters, the agent 180 may select a particular data center 149 (data center 149-1 for the example depicted in FIG. 1) and a group of one or multiple storage nodes 121 of the selected data center 149.

In accordance with example implementations, a user (a network administrator, for example) who is affiliated with the computer network containing the storage nodes 110 may initiate the creation of a replication partnership with the nodes 110. In this manner, in accordance with example implementations, the user may provide user input 176 (input derived from keystrokes, mouse interaction with a graphical user interface (GUI), touch gestures, and so forth) to a computer 170 for purposes of setting up the replication partnership, identifying the storage nodes 110 that form the local replication partner 108 and identifying criteria for the remote replication partner 120, such as the name of the remote replication partner 120. The setting up of the replication partnership may, for example, involve the user using the computer 170 to access the storage nodes 110 for purposes of retrieving data from the storage nodes 110 pertaining to credentials and identifications of the storage nodes 110.

In this manner, the user may, through a dialog box, enter a name of the remote replication partner. The browser 174 may then execute a script to cause the browser 174 to, through an HTTPS session with the storage node(s) 110, retrieve credentials (SSH keys, for example) from the storage nodes 110. In accordance with some implementations, access of the computer 170 to the storage nodes 110 may be through the use of private network fabric (not illustrated). The browser 174 may then communicate with the agent 180 via an HTTPS session, providing details and credentials of the storage node(s) 110 and requesting creation of a replication partnership between the local storage node(s) 110 and the storage node(s) 121 that are selected to form the remote replication partner 120.
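As one possible illustration, the partnership-creation request that the browser 174 sends to the agent 180 might carry a payload along the following lines; every field name and value in this sketch is hypothetical and is shown only to make the flow concrete.

import json

# Hypothetical payload sent by the browser to the agent's web portal over
# HTTPS; the field names and values here are illustrative only.
partnership_request = {
    "partnership_name": "remote-partner-east",
    "region": "east-coast",
    "storage_tier": "standard",
    "capacity_gb": 10240,
    "local_partner": {
        "storage_node_ids": ["node-110-1", "node-110-2"],
        "ssh_public_keys": [
            "ssh-rsa AAAA... node-110-1",
            "ssh-rsa AAAA... node-110-2",
        ],
    },
}

print(json.dumps(partnership_request, indent=2))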

The agent 180 may then communicate with a proxy server 152 for the data center 149-1 via the public network fabric 129. In this manner, as further described herein, the agent 180 may communicate with the proxy server 152 for purposes of transferring credentials for the storage nodes 110 to the proxy server 152 and configuring port forwarding that is used with the SSH tunnel 130. The agent 180 may then communicate replication partnership details and SSH tunnel credentials (an SSH key and an SSH user name, for example) to the browser 174, which may then communicate this information to the local replication partner 108 and cause the local replication partner 108 to initiate the SSH connection with the proxy server 152.

In accordance with some implementations, one of the storage nodes 110 of the local replication partner 108, such as storage node 110-1, is a manager for the group of storage nodes 110 (i.e., initiates and drives the replication for the group of storage nodes 110) and serves as an SSH tunnel endpoint 119 for the local replication partner 108. On the other end of the SSH tunnel 130, the proxy server 152 serves as an SSH tunnel endpoint 153 for the remote replication partner 120; and one of the storage nodes 121, such as storage node 121-1, is the manager for the group of storage nodes 121 of the remote replication partner 120.

As depicted in FIG. 1, the proxy server 152 may communicate with the storage nodes 121 via private network fabric 157. In accordance with example implementations, due to local and remote port forwarding associated with communications over the SSH tunnel 130 (as described herein), the storage nodes 110 may open a single SSH port for the SSH tunnel 130, and likewise, the proxy server 152 may open a single SSH port.

As noted above, in accordance with example implementations, the browser 174 may perform the functions described herein through the execution of a program, or script. In this regard, the browser 174, as well as the program executed by the browser 174, may be formed through machine executable instructions (i.e., “software”) that are stored in a memory 179 of the computer 170 and are executed by one or multiple hardware processors 177. In general, the memory 179 may be formed from a non-transitory storage medium, such as a storage medium formed from one or multiple semiconductor storage devices, magnetic storage devices, memristors, phase change memory devices, flash memory devices, volatile memory devices, non-volatile memory devices, a combination of storage devices formed from one or more of the foregoing, or other storage devices, and so forth. The processor(s) 177 may be, as examples, one or multiple Central Processing Units (CPUs), one or multiple CPU processing cores, and so forth.

The agent 180 may contain one or multiple hardware processors 184 and a memory 186 that stores instructions that, when executed by one or more of the processors 184, cause the processor(s) 184 to perform one or more functions of the agent 180, which are described herein. In a similar manner, the proxy server 152 may contain one or multiple hardware processors 154 and a memory 156 that stores instructions that, when executed by the processor(s) 154, cause the processor(s) 154 to perform one or more of the functions of the proxy server 152, which are described herein. It is noted that the memories 156 and 186 may be non-transitory memories and may contain one or more storage devices, similar to the memory 179. Moreover, the processors 154 and 184 may be processors similar to the processors 177.

In accordance with example implementations, each storage node 110 may include one or multiple storage devices 112 (e.g., hard disk drives (HDDs), magnetic storage drives, solid state drives (SSDs), flash memory drives, and so forth, or a combination thereof). Moreover, a storage node 110 may contain one or multiple processors 114 and a non-transitory memory 113 that stores instructions that, when executed by the processor(s) 114, cause the processor(s) 114 to perform one or more functions of the storage node 110 described herein. In particular, in accordance with some implementations, the execution of instructions by the processor(s) 114 may cause the processor(s) 114 to form background processes, or daemons, such as a group management engine 117.

The group management engine 117, in accordance with example implementations, controls the actions of the manager node 110 (such as node 110-1) for the group and, more specifically, controls the replication management services for the replication partner 108. In general, the group management engine 117 initiates and drives its associated replication group (i.e., the local replication partner 108) and communications with the remote replication partner 120. Each storage node 110 may also contain a data services engine 115, another daemon, in accordance with example implementations. Thus, in accordance with example implementations, if the local replication partner 108 has N storage nodes 110, then there are N instances of the data services engine 115 and one instance of the group management engine 117. In general, the data services engine 115 is responsible for data movement between the two partnership groups for purposes of transferring replication data between the groups. The data services engines 115 of the local replication partner 108 communicate data with corresponding data services engines 127 of the storage nodes 121 of the remote replication partner 120. Similar to the storage nodes 110, in accordance with example implementations, a single storage node 121 may contain a single group management engine 125 for purposes of providing replication management services for the particular replication group, and each storage node 121 of the replication group may contain a data services engine 127. Similar to the data services engine 115 and the group management engine 117, the data services engine 127 and the group management engine 125 may be background processes, or daemons, formed by the execution of machine executable instructions that are stored in a non-transitory memory 123 and executed by one or multiple hardware processors 124. Each storage node 121 may include one or multiple storage devices 122 (e.g., hard disk drives (HDDs), magnetic storage drives, solid state drives (SSDs), flash memory drives, and so forth, or a combination thereof).

In accordance with example implementations, the public network fabric 129 and the private network fabric 157 may include any type of wired or wireless communication network, including cellular networks (e.g., Global System for Mobile Communications (GSM), 3G, Long Term Evolution (LTE), Worldwide Interoperability for Microwave Access (WiMAX), etc.), digital subscriber line (DSL) networks, cable networks (e.g., coaxial networks, fiber networks, etc.), telephony networks, local area networks (LANs) or wide area networks (WANs), or any combination thereof. The public network fabric 129 may include any of the foregoing networks, as well as global networks (e.g., network fabric communicating Internet traffic), or any combination thereof.

FIG. 2 is an illustration 200 of local and remote port forwarding used in connection with the SSH tunnel 130 in accordance with example implementations. In particular, for this example, the storage node 110-1, the manager node for the local replication partner 108, has a private, internal Internet Protocol (IP) address of 192.168.10.1, and other storage nodes 110 (such as storage node 110-2) have other corresponding private, internal IP addresses, such as address 192.168.10.2 for the storage node 110-2. For the remote replication partner 120, the proxy server 152 has a public IP address 50.0.0.1 and a corresponding private, internal IP address 60.0.0.1. Moreover, similar to the storage nodes 110, the storage nodes 121 have corresponding private, internal IP addresses, such as internal IP address 60.0.0.2 for the storage node 121-1, which is the manager node for the remote replication partner 120. In a similar manner, other storage nodes 121 may have other corresponding private, internal IP addresses.

For purposes of the group management engines 117 and 125 communicating with each other, the SSH tunnel endpoint 153 provided by the proxy server 152 and a corresponding SSH tunnel endpoint 119 provided by the storage node 110-1 perform port translations. In this manner, in accordance with some implementations, when the agent 180 configures the proxy server 152 with the port translations, the agent 180 may request port translations similar to the following example:

ssh <user>@50.0.0.1 -L 10000:60.0.0.2:4213 -L 10001:60.0.0.2:4214 -R 50000:192.168.10.1:4213 -R 50001:192.168.10.1:4214 -R 50002:192.168.10.2:4214

The above example sets forth local port forwarding and reverse, or remote, port forwarding translations for the public IP address 50.0.0.1 of the proxy server 152. The “-L” delimiter signifies a local port forwarding translation immediately following the “-L” delimiter; and the “-R” delimiter signifies a remote port forwarding translation immediately following the delimiter. For example, the first local port forwarding translation “10000:60.0.0.2:4213” represents that an incoming communication from the SSH tunnel 130 (to the remote replication partner 120) directed to port 10000 is to be redirected by the proxy server 152 to port 4213 at internal IP address 60.0.0.2 (i.e., the address/port of the group management engine 125 of the storage node 121-1). In other words, the group management engine 117 sends communications for the group management engine 125 to port 10000, and the SSH tunnel endpoint 153 directs these communications to the private IP address/port of the group management engine 125.

As another example, the local port forwarding translation “10001:60.0.0.2:4214” in the example expression above represents another local port translation for the SSH tunnel endpoint 153, in which the endpoint 153 redirects traffic directed to port 10001 to internal IP address 60.0.0.2:4214, which is the address/port of the data services engine 127 of the storage node 121-1. In other words, a data services engine 115 of the local replication partner 108 may send data to a data services engine 127 of the storage node 121-1 using port 10001, and the SSH tunnel endpoint 153 directs this data to the appropriate private IP address/port of the storage node 121-1 assigned to the data services engine 127.

The example expression above also sets forth remote port forwarding translations, which are handled by the SSH tunnel endpoint 119 of the storage node 110-1. In this manner, in accordance with example implementations, the proxy server 152 is configured with the remote port forwarding; and when the storage node 110-1 initiates the SSH connection, the proxy server 152 sets up the SSH tunnel endpoint 119 for the remote port forwarding. As an example, the remote port forwarding “50000:192.168.10.1:4213” represents that the SSH tunnel endpoint 119 translates incoming traffic from the SSH tunnel 130 that is directed to port 50000 to internal IP address 192.168.10.1:4213, which is the address/port of the group management engine 117. Likewise, the remaining remote port forwarding translations set forth above provide remote port forwarding for the data services engine 115 of the storage node 110-1 and the data services engine 115 of the storage node 110-2.

Thus, as depicted at reference numeral 210 of FIG. 2, the group management engine 117 of the storage node 110-1 may access the group management engine 125 of the storage node 121-1 using partner address 192.168.10.1:10000, and the data services engines 115 of the storage nodes 110 may access the data services engines 127 of the storage nodes 121 using partner address 192.168.10.1:10001. On the other end of the SSH tunnel 130, the group management engine 125 of the storage node 121-1 may access the group management engine 117 of the storage node 110-1 using the partner address 60.0.0.1:50000; the data services engine 127 of the storage node 121-1 may communicate with the data services engine 115 of the storage node 110-1 using the partner address 60.0.0.1:50001; and the data services engine 127 may communicate with the data services engine 115 of the storage node 110-2 using the partner address 60.0.0.1:50002.
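The port translations of FIG. 2 can be summarized as two small mapping tables, one per tunnel endpoint. The following Python sketch restates them; the data structures and the resolve() helper are illustrative only and are not part of the described implementations.

# Local forwards handled at the proxy's SSH tunnel endpoint 153: traffic
# arriving on the listed port is delivered to the private address/port.
LOCAL_FORWARDS = {
    10000: ("60.0.0.2", 4213),  # group management engine 125
    10001: ("60.0.0.2", 4214),  # data services engine 127
}

# Remote forwards handled at the SSH tunnel endpoint 119 on node 110-1.
REMOTE_FORWARDS = {
    50000: ("192.168.10.1", 4213),  # group management engine 117
    50001: ("192.168.10.1", 4214),  # data services engine 115, node 110-1
    50002: ("192.168.10.2", 4214),  # data services engine 115, node 110-2
}

def resolve(forwards, listen_port):
    """Return the (private_ip, port) that a forwarded port maps to."""
    return forwards[listen_port]

assert resolve(LOCAL_FORWARDS, 10000) == ("60.0.0.2", 4213)
assert resolve(REMOTE_FORWARDS, 50002) == ("192.168.10.2", 4214)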

FIGS. 3A, 3B and 3C depict example techniques 300, 320, and 332, respectively, which may be used, in accordance with some implementations, to set up and initiate a replication partnership and an associated secure communication channel. The example technique 300 may be performed, for example, by a user via a browser, such as the browser 174 described above. The example technique 320 may be performed, for example, by an agent, such as the agent 180 described above, in response to messages received from the browser. The example technique 332 may be performed, for example, by a managing node of a local replication partner, such as the managing node 110-1 described above, in response to messages received from the agent 180.

Referring to FIG. 3A in conjunction with FIG. 1, the technique 300 includes, pursuant to block 304, the user opening the browser 174 and using the browser 174 to access the web portal 182 via an HTTPS session and to access the storage node(s) 110 (i.e., the nodes for the local replication partner) via an HTTPS session.

Pursuant to block 308, the user may then navigate to a page of the web portal 182 to configure the replication partnership, including naming the partnership, selecting a geographic region for the replication partner, selecting a capacity for the replication partner, and selecting a storage tier. The user may navigate, as depicted in block 312, to a page provided by the storage node(s) 110 (a page provided by the managing storage node 110, for example) to provide input to add the remote replication partner as the replication endpoint.

Pursuant to block 314, the technique 300 includes the user retrieving, via the browser 174, the SSH keys and identifiers of the storage node(s) 110 of the local replication partner 108 and providing, via the browser 174, the SSH keys and identifiers to the agent 180.

Referring to FIG. 3B and the technique 320 in conjunction with FIG. 1, pursuant to block 322, the agent 180 may, in response to receiving configuration information from a user, select a particular data center 149 and a group of one or multiple storage nodes 121 to form the remote replication partner 120. Pursuant to block 324, the agent 180 may communicate with the proxy server 152 and the selected storage node(s) 121 of the selected data center 149 to orchestrate the setting up of the replication partnership and the SSH tunnel 130. For example, this orchestration may include the agent 180 copying the SSH keys of the storage nodes 110 to the proxy server 152 and setting up a user ID for the managing storage node 110-1 in the proxy server 152. Here, the user ID may be an ID that is used for purposes of authenticating with the proxy server 152 and setting up the SSH tunnel 130. The orchestration may further include configuring the proxy server 152 to set up local and remote port forwarding for the SSH tunnel 130 and copying replication partnership credentials to the storage node(s) 121 of the remote replication partner 120. In accordance with example implementations, the replication partnership credentials are different from the SSH tunnel credentials and include, for example, a shared secret and a replication partner ID of the managing storage node 110-1. In accordance with example implementations, the above-described actions may be performed by one or multiple hardware processors of the agent 180 executing machine executable instructions to make calls to a REST API (not shown) of the agent 180.
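By way of a hedged sketch, the orchestration of block 324 might be coded along the following lines; the proxy and node objects, method names, and the specific port numbers stand in for the configuration calls described above and are assumptions made purely for illustration.

import secrets

def orchestrate_partnership(proxy, remote_nodes, local_ssh_keys, local_node_ids):
    # 1. Copy the SSH keys of the storage nodes 110 to the proxy server and
    #    set up the user ID that the managing storage node 110-1 will use to
    #    authenticate and set up the SSH tunnel 130.
    tunnel_user = f"repl-{local_node_ids[0]}"
    proxy.store_credentials(user=tunnel_user, ssh_keys=local_ssh_keys)

    # 2. Configure local and remote port forwarding for the SSH tunnel 130.
    proxy.configure_port_forwarding(
        local=[(10000, "60.0.0.2", 4213), (10001, "60.0.0.2", 4214)],
        remote=[(50000, "192.168.10.1", 4213), (50001, "192.168.10.1", 4214)],
    )

    # 3. Copy replication partnership credentials (distinct from the SSH
    #    tunnel credentials) to the remote partner's storage nodes 121.
    shared_secret = secrets.token_hex(16)
    for node in remote_nodes:
        node.set_partnership_credentials(partner_id=local_node_ids[0],
                                         secret=shared_secret)

    # 4. Return the details to be communicated back to the local partner.
    return {"tunnel_user": tunnel_user, "shared_secret": shared_secret}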

Pursuant to block 326, the agent 180 communicates replication partnership details to the local replication partner 108, where the replication partnership details include: the SSH tunnel 130 details, including the user ID for the storage node 110-1 and the public IP address of the proxy server 152; the SSH tunnel 130 port forwarding details; and the replication partner 120 details corresponding to the storage node(s) 121.

The technique 320 described above may be embodied (in whole or in part) in machine readable instructions that, when executed by a processor of the agent 180, cause the agent 180 to perform (some or all of) the operations of the example technique 320. The machine readable instructions may be stored on a non-transitory storage medium, which may include volatile media such as random-access memory (RAM) (e.g., DRAM, SRAM, etc.) and/or persistent (non-volatile) media such as non-volatile memory (e.g., PROM, EPROM, EEPROM, NVRAM, etc.), flash drives, hard disk drives, optical disks, etc.

Referring to FIG. 3C and the technique 332 in conjunction with FIG. 1, pursuant to block 334, the managing node 110-1 of the local replication partner 108 may, in response to receiving replication partnership details from the agent 180, initiate the SSH connection using the public IP address of the proxy server 152 and authenticate itself with the proxy server 152, pursuant to block 336. In accordance with block 338, after the SSH connection is set up, the managing storage node 110-1 may then configure the remote port forwarding to the storage node(s) 121 of the remote replication partner 120.
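A minimal sketch of how the managing storage node 110-1 might initiate the SSH connection from the details it receives is shown below; the function name, variable names, and key file path are illustrative assumptions, and the invocation simply mirrors the ssh command shown earlier.

import subprocess

def open_ssh_tunnel(tunnel_user, proxy_public_ip, key_path,
                    local_forwards, remote_forwards):
    # Build an ssh invocation equivalent to the earlier example command.
    cmd = ["ssh", "-N", "-i", key_path, f"{tunnel_user}@{proxy_public_ip}"]
    for listen_port, host, port in local_forwards:
        cmd += ["-L", f"{listen_port}:{host}:{port}"]
    for listen_port, host, port in remote_forwards:
        cmd += ["-R", f"{listen_port}:{host}:{port}"]
    # -N keeps the connection open for port forwarding without running a
    # remote command; the returned process lives for the tunnel's lifetime.
    return subprocess.Popen(cmd)

tunnel = open_ssh_tunnel(
    tunnel_user="repl-node-110-1",                 # hypothetical user ID
    proxy_public_ip="50.0.0.1",
    key_path="/etc/replication/ssh_key",           # hypothetical key location
    local_forwards=[(10000, "60.0.0.2", 4213), (10001, "60.0.0.2", 4214)],
    remote_forwards=[(50000, "192.168.10.1", 4213),
                     (50001, "192.168.10.1", 4214),
                     (50002, "192.168.10.2", 4214)],
)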

The technique 332 described above may be embodied (in whole or in part) in machine readable instructions that, when executed by a processor of the managing node 110-1, cause the managing node 110-1 to perform (some or all of) the operations of the example technique 332. The machine readable instructions may be stored on a non-transitory storage medium, which may include volatile media such as random-access memory (RAM) (e.g., DRAM, SRAM, etc.) and/or persistent (non-volatile) media such as non-volatile memory (e.g., PROM, EPROM, EEPROM, NVRAM, etc.), flash drives, hard disk drives, optical disks, etc.

Referring to FIG. 4, in accordance with example implementations, a technique 400 may thus include causing (block 404) an agent device to set up a replication partnership between a first storage node and a second storage node. Causing the agent device to set up the replication partnership may include configuring a proxy server that is associated with the second storage node to establish a secure communication channel for the replication partnership over a public network. Configuring the proxy server may include storing in the proxy server credentials for authenticating the first storage node to use the secure communication channel; and establishing port translations to be used in the secure communication channel in communicating replication data between the first storage node and the second storage node. The technique 400 may include communicating replication partnership information to the second storage node, pursuant to block 408.

Referring to FIG. 5, in accordance with example implementations, an apparatus 500 includes at least one processor 512 and a memory 504. The memory 504 stores instructions 508 that, when executed by the processor(s) 512, cause the processor(s) 512 to set up a replication partnership between a first storage node and a second storage node. More specifically, in accordance with example implementations, the instructions 508, when executed by the processor(s) 512, cause the processor(s) 512 to store in a proxy server credentials for authenticating the first storage node to use a secure communication channel over a public network for the replication partnership; and to establish port translations to be used in the secure communication channel to communicate replication data between the first storage node and the second storage node. The instructions 508, when executed by the processor(s) 512, further cause the processor(s) 512 to communicate replication partnership information to the second storage node.

The memory 504 may include any non-transitory storage medium, which may include volatile media, such as random-access memory (RAM) (e.g., DRAM, SRAM, etc.), and/or persistent (non-volatile) media, such as non-volatile memory (e.g., PROM, EPROM, EEPROM, NVRAM, etc.), flash drives, hard disk drives, optical disks, etc.

Referring to FIG. 6, in accordance with example implementations, a non-transitory storage medium 600 stores machine executable instructions 610. In some examples, the instructions 610 may, when executed by a machine (a processor-based machine, for example), form an agent, such as the agent 180, that is to orchestrate establishment of a replication partnership between a local replication partner and a remote replication partner. For example, the instructions 610 may be such that, when they are executed by a machine, they cause the machine to provide an interface to receive input representing a credential associated with a first storage node and input representing criteria to select a replication partner storage node for the first storage node; access a proxy server for the replication partner storage node; communicate data representing the credential to the proxy server; and communicate with the proxy server to set up port forwarding for a future secure network tunnel to communicate replication data between the first storage node and the replication partner storage node, where the proxy server forms an endpoint of the secure network tunnel. The first storage node forms another endpoint of the secure network tunnel.

While the present disclosure has been described with respect to a limited number of implementations, those skilled in the art, having the benefit of this disclosure, will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the elements of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or elements are mutually exclusive.

What is claimed is:
1. A method comprising: causing an agent device to set up a replication partnership between a first storage node and a second storage node, wherein causing the agent device to set up the replication partnership comprises: configuring a proxy server that is associated with the second storage node to establish a secure communication channel for the replication partnership over a public network, wherein configuring the proxy server comprises: storing in the proxy server credentials for authenticating the first storage node to use the secure communication channel; and establishing port translations to be used in the secure communication channel in communicating replication data between the first storage node and the second storage node; and communicating replication partnership information to the second storage node.
2. The method of claim 1, wherein the proxy server and the second storage node comprise part of a private network; the private network comprises a plurality of storage nodes, including the second storage node; and configuring the proxy server further comprises selecting the second storage node from among the plurality of storage nodes.
3. The method of claim 1, wherein establishing the port translations comprises configuring local tunnel and reverse tunnel port translations associated with a public Internet Protocol (IP) address of the proxy server.
4. The method of claim 1, wherein configuring the proxy server comprises configuring the proxy server to communicate with a tunnel endpoint associated with the first storage node.
5. The method of claim 1, further comprising: providing a portal accessible through a public network to receive data representing the credentials of the first storage node.
6. The method of claim 5, further comprising: using the portal to receive input identifying a geographic region for a replication partner for the first storage node; selecting the second storage node based on the identified geographic region; and communicating an identifier to the proxy server, wherein the identifier identifies the second storage node.
7. The method of claim 6, further comprising: further basing selection of the second storage node on input identifying a storage tier associated with the replication partnership.
8. The method of claim 1, wherein configuring the proxy server further comprises configuring the proxy server to select one of the first storage node and the second storage node to be a replication source or a replication target.
9. The method of claim 1, wherein configuring the proxy server further comprises: communicating a key associated with the first storage node to the proxy server.
10. The method of claim 1, wherein configuring the proxy server further comprises communicating data to the proxy server representing a replication partnership identification associated with the first storage node and a replication partnership credential associated with the first storage node.
11. The method of claim 1, wherein configuring the proxy server further comprises communicating data representing an identification of the first storage node.
12. An apparatus comprising: at least one processor; and a memory that stores instructions that, when executed by the at least one processor, cause the at least one processor to: store in a proxy server credentials for authenticating a first storage node to use a secure communication channel, wherein the first storage node is associated with a replication partnership with a second storage node; establish port translations to be used in the secure communication channel to communicate replication data between the first storage node and the second storage node; and communicate replication partnership information to the second storage node.
13. The apparatus of claim 12, wherein the instructions, when executed by the at least one processor, cause the at least one processor to: provide a portal accessible through a public network to receive data representing a credential of the first storage node.
14. The apparatus of claim 13, wherein the instructions, when executed by the at least one processor, cause the at least one processor to: use the portal to receive input identifying a geographic region for a replication partner for the first storage node; select the second storage node based on the identified geographic region; and communicate an identifier to the proxy server, wherein the identifier identifies the second storage node.
15. The apparatus of claim 12, wherein the instructions, when executed by the at least one processor, cause the at least one processor to configure the proxy server to set up a network tunnel.
16. The apparatus of claim 15, wherein the network tunnel comprises a Secure Shell (SSH) tunnel.
17. A non-transitory storage medium storing instructions that, when executed by a machine, cause the machine to: provide an interface to receive input representing a credential associated with a first storage node and input representing criteria to select a replication partner storage node for the first storage node; access a proxy server for the replication partner storage node; communicate data representing the credential to the proxy server; and communicate with the proxy server to set up port forwarding for a future secure network tunnel to communicate replication data between the first storage node and the replication partner storage node, wherein the proxy server forms an endpoint of the secure network tunnel and the first storage node forms another endpoint of the secure network tunnel.
18. The non-transitory storage medium of claim 17, wherein the instructions, when executed by the machine, cause the machine to: provide access to the interface via a public network.
19. The non-transitory storage medium of claim 17, wherein the instructions, when executed by the machine, cause the machine to communicate with the proxy server to reserve a public network port of the proxy server and map the public network port to a private network port of the replication partner storage node.
20. The non-transitory storage medium of claim 17, wherein the instructions, when executed by the machine, cause the machine to select the replication partner based on selection criteria.